 Marodi Jeh


Toward Micro-Dialect Identification in Diaglossic and Code-Switched Environments

Muhammad Abdul-Mageed, Chiyu Zhang, AbdelRahim Elmadany, Lyle Ungar

arXiv.org Artificial Intelligence

Although the prediction of dialects is an important language processing task, with a wide range of applications, existing work is largely limited to coarse-grained varieties. Inspired by geolocation research, we propose the novel task of Micro-Dialect Identification (MDI) and introduce MARBERT, a new language model with striking abilities to predict a fine-grained variety (as small as that of a city) given a single, short message. For modeling, we offer a range of novel spatially and linguistically-motivated multi-task learning models. To showcase the utility of our models, we introduce a new, large-scale dataset of Arabic micro-varieties (low-resource) suited to our tasks. MARBERT predicts micro-dialects with 9.9% F1, ~76X better than a majority class baseline. Our new language model also establishes new state-of-the-art on several external tasks.



How I went to Somaliland and… Taught Artificial Intelligence

#artificialintelligence

TL;DR: I had the pleasure of being part of the first AI conference in Somaliland, organised by Shaqodoon, HarHub and Elmi Academy, and featuring speakers from Google, MIT, major Somaliland telecoms, banks, the University of Hargeisa and the Ministry of Telecommunication & Technology of Somaliland. See the slides (lectures, workshops) and the event program for details. Below are my personal notes and pictures from this trip. Whenever I tell this story, people seem surprised by my choice to spend vacation time in Somaliland and run an AI-related event there. So let me share some first-hand experience and explain why trips and events like this are useful, fun and safe.